Mean Squared Residue Based Biclustering Algorithms

Authors

Stefan Gremalschi

Gulsah Altun

Abstract

The availability of large microarray data sets has brought many challenges for biological data mining. Following Cheng and Church [4], many different biclustering methods have been used to find appropriate subsets of experimental conditions. Yet no paper directly optimizes or bounds the Mean Squared Residue (MSR) originally suggested by Cheng and Church. Their algorithm, given an expression matrix A and an upper bound on MSR, finds k almost non-overlapping biclusters whose sizes are not predefined, which makes it difficult to compare with other methods. In this paper, we propose two new MSR-based biclustering methods. The first method is a dual biclustering algorithm, which finds a (k × l)-bicluster with MSR using a greedy approach. The second method combines the dual biclustering algorithm with quadratic programming: the dual biclustering algorithm reduces the size of the matrix so that the quadratic program can find an optimal bicluster reasonably fast. We control bicluster overlap by changing the penalty for reusing cells in biclusters. For yeast, the average MSR of the biclusterings in [4] is almost the same as that of the proposed dual biclustering, while the median MSR is 1.5 times larger, implying that the quadratic program finds much better smaller biclusters.
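
For readers unfamiliar with the score, the MSR of Cheng and Church [4] for a bicluster with row set I and column set J is the average, over all cells, of the squared residue a_ij - a_iJ - a_Ij + a_IJ, where a_iJ is the mean of row i, a_Ij the mean of column j, and a_IJ the mean of the whole bicluster. The following is a minimal Python sketch of that computation only. The function name and the list-of-lists matrix representation are assumptions made for illustration; this is not the authors' code, and it does not implement the dual biclustering or quadratic programming methods.

```python
def mean_squared_residue(matrix, rows, cols):
    """Return the MSR of the bicluster of `matrix` given by `rows` and `cols`.

    `matrix` is a list of lists of numbers; `rows` and `cols` are lists of
    indices. The name and signature are illustrative assumptions.
    """
    n_rows, n_cols = len(rows), len(cols)
    if n_rows == 0 or n_cols == 0:
        raise ValueError("a bicluster needs at least one row and one column")

    # Extract the submatrix selected by the bicluster.
    sub = [[matrix[i][j] for j in cols] for i in rows]

    # Row means a_iJ, column means a_Ij, and overall mean a_IJ.
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows

    # Average the squared residues over all cells.
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue * residue
    return total / (n_rows * n_cols)


# Rows that differ only by a constant shift: the MSR is zero up to
# floating-point rounding.
shifted = [[1, 2, 3], [2, 3, 4], [4, 5, 6]]
print(mean_squared_residue(shifted, [0, 1, 2], [0, 1, 2]))
```

A low score indicates that the selected rows follow the same pattern across the selected columns up to an additive shift, which is why an upper bound on MSR can serve as the acceptance criterion for a bicluster.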

Similar resources

A Novel Coherence Measure for Discovering Scaling Biclusters from Gene Expression Data

Biclustering methods are used to identify a subset of genes that are co-regulated in a subset of experimental conditions in microarray gene expression data. Many biclustering algorithms rely on optimizing mean squared residue to discover biclusters from a gene expression dataset. Recently it has been proved that mean squared residue is only good in capturing constant and shifting biclusters. Ho...

Biclustering of Gene Expression Data using a Two-Phase Method

Biclustering is a very useful data mining technique which identifies coherent patterns from microarray gene expression data. A bicluster of a gene expression dataset is a subset of genes which exhibit similar expression patterns along a subset of conditions. Biclustering is a powerful analytical tool for the biologist and has generated considerable interest over the past few decades. Many biclu...

Shifting and scaling patterns from gene expression data

MOTIVATION: During the last years, the discovering of biclusters in data is becoming more and more popular. Biclustering aims at extracting a set of clusters, each of which might use a different subset of attributes. Therefore, it is clear that the usefulness of biclustering techniques is beyond the traditional clustering techniques, especially when datasets present high or very high dimensional...

Biclustering of Expression Data

An efficient node-deletion algorithm is introduced to find submatrices in expression data that have low mean squared residue scores and it is shown to perform well in finding co-regulation patterns in yeast and human. This introduces "biclustering", or simultaneous clustering of both genes and conditions, to knowledge discovery from expression data. This approach overcomes some problems associa...

Random walk biclustering for microarray data

A biclustering algorithm, based on a greedy technique and enriched with a local search strategy to escape poor local minima, is proposed. The algorithm starts with an initial random solution and searches for a locally optimal solution by successive transformations that improve a gain function. The gain function combines the mean squared residue, the row variance, and the size of the bicluster. ...

Publication date: 2008
